Regularization by Adding Redundant Features
Authors
Abstract
The Pseudo Fisher Linear Discriminant (PFLD), which is based on a pseudo-inverse technique, shows a peaking behaviour of the generalization error when the training sample size is close to the feature dimensionality. As the training sample size increases, the generalization error first decreases to a minimum, then rises to a maximum at the point where the training sample size equals the data dimensionality, and afterwards decreases again. Several ways exist to address this problem. This paper shows that noise injection by adding redundant features to the data also helps to improve the generalization error of this classifier for critical training sample sizes.
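The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names (`fit_pfld`, `predict_pfld`, `add_redundant_features`) are hypothetical, the two-class setting, the pooled covariance normalized by n - 2, and the Gaussian noise model for the redundant features are assumptions, and padding test points with zeros (the noise mean) is likewise an assumption made only for illustration.

```python
import numpy as np


def fit_pfld(X, y):
    """Fit a two-class pseudo-Fisher linear discriminant; returns (w, b)."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance (assumed normalization: n - 2).
    Z = np.vstack([X0 - m0, X1 - m1])
    S = Z.T @ Z / (len(X) - 2)
    # The pseudo-inverse stands in for the ordinary inverse,
    # so S may be singular (fewer samples than features).
    w = np.linalg.pinv(S) @ (m1 - m0)
    b = -0.5 * w @ (m0 + m1)
    return w, b


def predict_pfld(w, b, X):
    """Return class 1 where the discriminant is positive, else class 0."""
    return (np.asarray(X, dtype=float) @ w + b > 0).astype(int)


def add_redundant_features(X, n_extra, sigma, rng):
    """Append n_extra pure-noise columns drawn from N(0, sigma^2) (assumed noise model)."""
    noise = rng.normal(0.0, sigma, size=(X.shape[0], n_extra))
    return np.hstack([X, noise])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_per_class, p, n_extra = 10, 30, 20
    # Toy data: two Gaussian classes with shifted means.
    X = np.vstack([rng.normal(-0.5, 1.0, (n_per_class, p)),
                   rng.normal(+0.5, 1.0, (n_per_class, p))])
    y = np.repeat([0, 1], n_per_class)

    # Train on data augmented with redundant noise features.
    X_aug = add_redundant_features(X, n_extra, sigma=1.0, rng=rng)
    w, b = fit_pfld(X_aug, y)

    # Assumption: test points are padded with zeros, the mean of the noise.
    X_test = rng.normal(0.5, 1.0, (5, p))
    X_test_aug = np.hstack([X_test, np.zeros((5, n_extra))])
    print(predict_pfld(w, b, X_test_aug))
```

In this sketch the redundant columns carry no class information; they only raise the data dimensionality, which moves the training sample size away from the critical point where it equals the number of features. The values of `n_extra` and `sigma` are arbitrary here and would need to be tuned in practice.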
Similar resources
A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)
Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...
Rough Hypercuboid Based Supervised Regularized Canonical Correlation for Multimodal Data Analysis
One of the main problems in real life omics data analysis is how to extract relevant and non-redundant features from high dimensional multimodal data sets. In general, supervised regularized canonical correlation analysis (SRCCA) plays an important role in extracting new features from multimodal omics data sets. However, the existing SRCCA optimizes regularization parameters based on the qualit...
Exploiting Model Capacity by Constraining Within-batch Features to Be Orthogonal
Deep networks have been shown to greatly benefit from large model capacity when trained using various recent deep learning techniques. But at the same time, features in such large capacity networks have a potential to be redundant. In this work, we propose a new regularization method to exploit the given network capacity effectively. By minimizing the redundancy among in-layer filters and the c...
Estimating Tool Damage & Remaining Useful Life of a CNC milling cutter by applying Time-Frequency Analysis, Machine Learning and Evolutionary Optimization
The constant evolution of machinery and the increased degree of automation along with advances in technological knowledge have given rise to predictive maintenance (PM), a maintenance scheme that can diagnose the current state of machinery or even predict its remaining life based on collected data. In the scope of this thesis, a PM framework is designed for the estimation of tool damage and rem...
Sparse auto-associative neural networks: theory and application to speech recognition
This paper introduces the sparse auto-associative neural network (SAANN) in which the internal hidden layer output is forced to be sparse. This is achieved by adding a sparse regularization term to the original reconstruction error cost function, and updating the parameters of the network to minimize the overall cost. We show applicability of this network to phoneme recognition by extracting sp...